1.
J Acoust Soc Am ; 155(1): 381-395, 2024 Jan 01.
Article in English | MEDLINE | ID: mdl-38240668

ABSTRACT

Auditory perceptual evaluation is considered the gold standard for assessing voice quality, but its reliability is limited by inter-rater variability and coarse rating scales. This study investigates a continuous, objective approach to evaluating hoarseness severity that combines machine learning (ML) and sustained phonation. For this purpose, 635 acoustic recordings of the sustained vowel /a/ and subjective ratings based on the roughness, breathiness, and hoarseness scale were collected from 595 subjects. A total of 50 temporal, spectral, and cepstral features were extracted from each recording and used to identify suitable ML algorithms. Using variance and correlation analysis followed by backward elimination, a subset of relevant features was selected. Recordings were classified into two levels of hoarseness, H<2 and H≥2, yielding a continuous probability score y∈[0,1]. An accuracy of 0.867 and a correlation of 0.805 between the model's predictions and subjective ratings were obtained using only five acoustic features and logistic regression (LR). Further examination of recordings pre- and post-treatment revealed high qualitative agreement with the change in subjectively determined hoarseness levels. Quantitatively, a moderate correlation of 0.567 was obtained. This quantitative approach to hoarseness severity estimation shows promising results and potential for improving the assessment of voice quality.


Subjects
Dysphonia , Hoarseness , Humans , Hoarseness/diagnosis , Reproducibility of Results , Voice Quality , Phonation , Acoustics , Speech Acoustics , Speech Production Measurement
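
The abstract of result 1 describes a two-class logistic regression (H<2 versus H≥2) whose output probability serves as a continuous score y∈[0,1]. The following is a minimal sketch of that idea, not the authors' implementation: the training loop is plain gradient descent, the function names are hypothetical, and the toy data (a single standardized feature instead of the five selected acoustic features) is invented purely for illustration.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function mapping any real z into (0, 1)."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_score(weights, bias, features):
    """Continuous score y in [0, 1]: modeled probability of the class H>=2."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Batch gradient descent on the mean log-loss (no regularization)."""
    n, d = len(X), len(X[0])
    weights, bias = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            err = predict_score(weights, bias, xi) - yi
            for j in range(d):
                grad_w[j] += err * xi[j]
            grad_b += err
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy data, invented for illustration only: one standardized feature per
# recording, label 1 for H>=2 and 0 for H<2.
X_toy = [[-2.0], [-1.5], [-1.0], [-0.5], [0.5], [1.0], [1.5], [2.0]]
y_toy = [0, 0, 0, 0, 1, 1, 1, 1]

weights, bias = fit_logistic(X_toy, y_toy)
low = predict_score(weights, bias, [-1.8])   # score for a "mild" toy recording
high = predict_score(weights, bias, [1.8])   # score for a "severe" toy recording
```

Thresholding the score at 0.5 recovers the binary decision, while the raw score can be compared against graded subjective ratings, which is how the abstract frames the reported correlation.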
2.
J Voice ; 2023 Aug 28.
Article in English | MEDLINE | ID: mdl-37648625

ABSTRACT

OBJECTIVE: The first goal of this study was to investigate the coverage of laryngeal structures using two potential administration techniques for synthetic mucus: inhalation and lozenge ingestion. As a second research question, the study investigated the potential effects of these techniques on standardized voice assessment parameters. METHODS: Fluorescein was added to throat lozenges and to an inhalation solution to visualize the coverage of laryngeal structures through blue light imaging. The study included 70 vocally healthy subjects. Fifty subjects underwent administration via lozenge ingestion and 20 subjects performed the inhalation process. For the first research question, the recordings from the blue light imaging system were categorized to compare the extent of coverage on individual laryngeal structures objectively. Secondly, a standardized voice evaluation protocol was performed before and after each administration to determine any measurable effects on typical voice parameters. RESULTS: The administration via inhalation demonstrated complete coverage of all laryngeal structures, including the vocal folds, ventricular folds, and arytenoid cartilages, as visualized by the fluorescent dye. In contrast, the application of the lozenge predominantly covered the pharynx and laryngeal surface toward the aryepiglottic fold, but not the inferior structures. Overall, the comparison before and after administration showed no clear effect, although a minor deterioration of the acoustic signal was noted in the shimmer and cepstral peak prominence after the inhalation. CONCLUSIONS: Our findings indicate that the inhalation process is a more effective technique for covering deeper laryngeal structures such as the vocal folds and ventricular folds with synthetic mucus. This knowledge enables further in vivo studies on the role of laryngeal mucus in phonation in general, and how it can be substituted or supplemented for patients with reduced glandular activity as well as for heavy voice users.

3.
J Voice ; 2023 Feb 09.
Article in English | MEDLINE | ID: mdl-36774264

ABSTRACT

OBJECTIVES: The Nyquist plot provides a graphical representation of the glottal cycles as elliptical trajectories in a 2D plane. This study proposes a methodology to parameterize the Nyquist plot in support of the quantitative analysis of voice disorders. METHODS: We considered high-speed videoendoscopy recordings of 33 functional dysphonia (FD) patients and 33 normophonic controls (NC). Quantitative analysis was performed by computing four shape-based parameters from the Nyquist plot: Variability, Size (Perimeter and Area), and Consistency. Additionally, we performed automatic classification using a linear support vector machine and feature importance analysis by combining the proposed features with state-of-the-art glottal area waveform (GAW) parameters. RESULTS: We found that the inter-cycle variability was significantly higher in FD patients compared to NC. We achieved a classification accuracy of 83% when the top 30 most important features were used. Furthermore, the proposed Nyquist plot features were ranked in the top 12 most important features. CONCLUSIONS: The Nyquist plot provides complementary information for subjective and objective assessment of voice disorders. On the one hand, with visual inspection it is possible to observe intra- and inter-glottal cycle irregularities during sustained phonation. On the other hand, shape-based parameters allow quantifying such irregularities and provide complementary information to state-of-the-art GAW parameters.
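
The abstract of result 3 names Perimeter and Area as size parameters of the closed trajectory that each glottal cycle traces in the 2D plane, but does not say how they are computed. One plausible way, shown here only as an assumption, is to treat a sampled cycle as a closed polygon and apply the shoelace formula for area and summed segment lengths for perimeter. The function names and the synthetic ellipse standing in for one cycle are hypothetical.

```python
import math


def closed_curve_perimeter(points):
    """Sum of segment lengths, including the segment closing the curve."""
    n = len(points)
    return sum(math.dist(points[i], points[(i + 1) % n]) for i in range(n))


def closed_curve_area(points):
    """Enclosed area of a simple closed polygon via the shoelace formula."""
    n = len(points)
    twice_area = 0.0
    for i in range(n):
        x0, y0 = points[i]
        x1, y1 = points[(i + 1) % n]
        twice_area += x0 * y1 - x1 * y0
    return abs(twice_area) / 2.0


def synthetic_ellipse(a, b, n=720):
    """Hypothetical stand-in for one glottal cycle: an ellipse sampled n times."""
    return [
        (a * math.cos(2 * math.pi * k / n), b * math.sin(2 * math.pi * k / n))
        for k in range(n)
    ]


cycle = synthetic_ellipse(2.0, 1.0)
area = closed_curve_area(cycle)            # close to pi * a * b
perimeter = closed_curve_perimeter(cycle)

# Sanity check on a shape with known values: the unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
square_area = closed_curve_area(square)
square_perimeter = closed_curve_perimeter(square)
```

Computing these two values per cycle and then looking at their spread across cycles would be one simple way to quantify the inter-cycle irregularity the abstract refers to, although the paper's own Variability and Consistency definitions are not given in the abstract.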

4.
Appl Sci (Basel) ; 12(19), 2022 Oct.
Article in English | MEDLINE | ID: mdl-37583544

ABSTRACT

Endoscopic high-speed video (HSV) systems for visualization and assessment of vocal fold dynamics in the larynx are diverse and technically advancing. To account for the resulting "concept shifts" in neural network (NN)-based image processing, re-training of already trained and deployed NNs is necessary to allow for sufficiently accurate image processing for new recording modalities. We propose and discuss several re-training approaches for convolutional neural networks (CNN) used for HSV image segmentation. Our baseline CNN was trained on the BAGLS data set (58,750 images). The new BAGLS-RT data set consists of an additional 21,050 images from previously unused HSV systems, light sources, and different spatial resolutions. Results showed that increasing data diversity by means of preprocessing already improves the segmentation accuracy (mIoU + 6.35%). Subsequent re-training further increases segmentation performance (mIoU + 2.81%). For re-training, finetuning with dynamic knowledge distillation showed the most promising results. Data variety for training and additional re-training is a helpful tool to boost HSV image segmentation quality. However, when performing re-training, the phenomenon of catastrophic forgetting should be kept in mind, i.e., adaptation to new data while forgetting already learned knowledge.
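
The segmentation gains in the abstract of result 4 are reported as changes in mIoU (mean intersection over union). The abstract does not define the averaging, so the sketch below assumes the mean is taken over per-image IoU values of binary masks; the function names are hypothetical, and scoring two empty masks as 1.0 is a convention chosen here, not something stated in the source.

```python
def iou(pred, target):
    """IoU of two binary masks given as equally sized nested lists of 0/1."""
    intersection = 0
    union = 0
    for row_pred, row_target in zip(pred, target):
        for p, t in zip(row_pred, row_target):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else intersection / union


def mean_iou(preds, targets):
    """Average of the per-image IoU values."""
    scores = [iou(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


# Two tiny 2x2 masks, invented for illustration.
preds = [[[1, 1], [0, 0]], [[0, 1], [0, 1]]]
targets = [[[1, 0], [0, 0]], [[0, 1], [0, 1]]]

first = iou(preds[0], targets[0])   # 1 shared pixel out of 2 in the union
score = mean_iou(preds, targets)    # mean of 0.5 and 1.0
```

Evaluating the same metric on both the original and the newly added data after re-training is one straightforward way to notice the catastrophic forgetting mentioned at the end of the abstract.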
